Introduction¶

Business Context:¶

Trade data are an essential component of a nation's economy. Net Trade, calculated as exports minus imports, is often use to evaluate economic strength. Italy, traditionally a net exporter, reversed this trend in 2022.

Business Question:¶

Investors want to investigate Italy's recent shift to net importer, seeking insights into its overall impact and main cause. They also aim to understand Italy's key export and import markets, if they has changed and have a quick glance of them.

Importing Necessary Libraries¶

In [1]:
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
import numpy as np
import geopandas as gpd
import dash
from dash import dcc, html
from dash.dependencies import Input, Output
from jupyter_dash import JupyterDash
import socket

Table of Content¶

  1. Data Import and Pre Processing
  2. Preliminary Data Exploration
  3. Explainatory Data Analysis
  4. Interactive Dashboard

1 - Data Import and Pre Processing¶

Source¶

Source: WITS - World Integrated Trade Solution

WITS - World Integrated Trade Solution

WITS is an online database and analytical tool provided by the World Bank. It offers access to comprehensive international trade and tariff data, allowing users to explore and analyze global trade.

Importing the Dataset¶

World Integrated Trade Solution (WITS) allow you to download up to 10000 rows per download, so to collect all data from 2017 to 2022 we need to download in two trances and merge them toghether.

Required Datasets: These datasets are readily available for download from this Google Drive link.

In [2]:
_df1 = pd.read_csv('Ita_Trade1720.csv', sep=';', encoding='latin-1')
_df2 = pd.read_csv('Ita_Trade2122.csv', sep=';', encoding='latin-1')

df = pd.concat([_df1, _df2], axis=0)

df
Out[2]:
Year TradeFlowName PartnerName PartnerISO3 Partner Region Partner Income Group ProductCode TradeValue in 1000 USD
0 2017 Export Afghanistan AFG South Asia Low income 2 5.094
1 2017 Export Afghanistan AFG South Asia Low income 8 223.128
2 2017 Export Afghanistan AFG South Asia Low income 9 25.942
3 2017 Export Afghanistan AFG South Asia Low income 12 272.858
4 2017 Export Afghanistan AFG South Asia Low income 15 7.879
... ... ... ... ... ... ... ... ...
41354 2022 Import Zimbabwe ZWE Sub-Saharan Africa Lower middle income 84 13.434
41355 2022 Import Zimbabwe ZWE Sub-Saharan Africa Lower middle income 85 1.487
41356 2022 Import Zimbabwe ZWE Sub-Saharan Africa Lower middle income 90 61.217
41357 2022 Import Zimbabwe ZWE Sub-Saharan Africa Lower middle income 97 160.931
41358 2022 Import Zimbabwe ZWE Sub-Saharan Africa Lower middle income 99 93.516

123650 rows × 8 columns

In [3]:
print(df.info())
<class 'pandas.core.frame.DataFrame'>
Int64Index: 123650 entries, 0 to 41358
Data columns (total 8 columns):
 #   Column                  Non-Null Count   Dtype  
---  ------                  --------------   -----  
 0   Year                    123650 non-null  int64  
 1   TradeFlowName           123650 non-null  object 
 2   PartnerName             123650 non-null  object 
 3   PartnerISO3             123650 non-null  object 
 4   Partner Region          123650 non-null  object 
 5   Partner Income Group    123650 non-null  object 
 6   ProductCode             123650 non-null  int64  
 7   TradeValue in 1000 USD  123650 non-null  float64
dtypes: float64(1), int64(2), object(5)
memory usage: 8.5+ MB
None

Explaining the Data¶

Year: The year of the trade transaction. It is represented as an integer.
TradeFlowName: The type of trade flow, indicating whether it is an export or import. It is represented as an object (string).
PartnerName: The name of the trading partner country or entity. It is represented as an object (string).
PartnerISO3: The ISO 3166-1 alpha-3 code of the trading partner country. It is represented as an object (string).
Partner Region: The region to which the trading partner country belongs. It is represented as an object (string).
Partner Income Group: The income group classification of the trading partner country. It indicates the economic development level of the country, such as low income, middle income, high income, etc. It is represented as an object (string).
ProductCode: The HS2 code for every group of items. It is represented as an integer.
TradeValue in 1000 USD: The value of the trade transaction in 1000 USD (United States Dollars). It represents the monetary amount of the trade and is represented as a numerical value (float or integer).

The data present no missing value, but some preprocessing is needed.

HS Nomenclature¶

HS code, short for Harmonized System code, is a standardized numeric method of classifying traded products, developed and maintained by the World Customs Organization (WCO). These codes are used internationally to categorize and define commodities for customs, statistical, and regulatory purposes, facilitating global trade.

HS2 nomenclature is the broadest category in the HS code system, where each product is classified into a two-digit code. For instance, all types of fruits and nuts fall under the HS2 category '08'. This high-level classification is used for general trade overviews and comparisons between broad sectors of goods.

Data Pre-Processing¶

  • Trade values will be converted into million of USD
  • Product lables will be added to better understand the product instead of HS2 the code
  • Category will be created to divided the products in groups
In [4]:
# Rename the column 'TradeValue in 1000 USD' as 'TradeValue'
# New values will refer to USD in Millions
df.rename(columns={'TradeValue in 1000 USD': 'TradeValue'}, inplace=True)
df['TradeValue'] /= 1000
In [5]:
# Create a dictionary to map HS2 codes to product labels
hs_code_map = {
    1: "Live animals",
    2: "Edible meat and offal",
    3: "Fish, crustaceans, molluscs",
    4: "Dairy products, eggs, honey",
    5: "Other animal products",
    6: "Plants, bulbs, flowers",
    7: "Vegetables and tubers",
    8: "Fruits and nuts",
    9: "Coffee, tea, spices",
    10: "Cereals",
    11: "Milling products, starches",
    12: "Oil seeds, fruits, grains",
    13: "Gums, resins, saps",
    14: "Plaiting materials, plants",
    15: "Fats, oils, waxes",
    16: "Meat and fish preparations",
    17: "Sugars and confectionery",
    18: "Cocoa and preparations",
    19: "Cereal preparations, pastry",
    20: "Vegetable and fruit preparations",
    21: "Miscellaneous food preparations",
    22: "Beverages, spirits, vinegar",
    23: "Food industry residues, fodder",
    24: "Tobacco and substitutes",
    25: "Salt, sulphur, earths, stone",
    26: "Ores, slag, ash",
    27: "Mineral fuels, oils, waxes",
    28: "Inorganic and organic chemicals",
    29: "Organic chemicals",
    30: "Pharmaceutical products",
    31: "Fertilizers",
    32: "Dyeing extracts, paints",
    33: "Essential oils, perfumery",
    34: "Soap, cleaning preparations",
    35: "Albumin, glues, enzymes",
    36: "Explosives, pyrotechnics",
    37: "Photographic and cinematographic goods",
    38: "Miscellaneous chemical products",
    39: "Plastics and articles",
    40: "Rubber and articles",
    41: "Raw hides, skins, leather",
    42: "Leather articles",
    43: "Furskins and artificial fur",
    44: "Wood and articles",
    45: "Cork and articles",
    46: "Straw, basketware",
    47: "Paper, paperboard",
    48: "Paper and paperboard articles",
    49: "Printed books, newspapers",
    50: "Silk",
    51: "Wool, animal hair, yarn",
    52: "Cotton",
    53: "Vegetable textile fibers",
    54: "Man-made fiber yarn",
    55: "Man-made staple fibers",
    56: "Wadding, nonwovens, twine",
    57: "Carpets, textile floor coverings",
    58: "Special woven fabrics",
    59: "Impregnated, coated textiles",
    60: "Knitted or crocheted fabrics",
    61: "Knitted or crocheted apparel",
    62: "Non-knitted apparel",
    63: "Textile articles, sets, rags",
    64: "Footwear",
    65: "Headgear",
    66: "Umbrellas, canes, whips",
    67: "Feathers, artificial flowers",
    68: "Stone, plaster, cement",
    69: "Ceramic products",
    70: "Glass and glassware",
    71: "Precious stones, metals",
    72: "Iron and steel",
    73: "Iron and steel articles",
    74: "Copper and articles",
    75: "Nickel and articles",
    76: "Aluminium and articles",
    78: "Lead and articles",
    79: "Zinc and articles",
    80: "Tin and articles",
    81: "Other base metals, cermets",
    82: "Tools, implements, cutlery",
    83: "Miscellaneous metal articles",
    84: "Machinery, appliances",
    85: "Electrical machinery, equipment",
    86: "Railway, tramway equipment",
    87: "Vehicles and accessories",
    88: "Aircraft, spacecraft, parts",
    89: "Ships, boats, floating structures",
    90: "Optical, medical instruments",
    91: "Clocks, watches",
    92: "Musical instruments",
    93: "Arms, ammunition",
    94: "Furniture, lamps, prefabs",
    95: "Toys, games, sports requisites",
    96: "Miscellaneous manufactured articles",
    97: "Works of art, antiques",
    99: "Unspecified"
}
In [6]:
# Create product categories
hs_category_dict = {
    "Raw Food": [1, 2, 3, 4, 6, 7, 8, 9, 10, 11, 12, 14],
    "Processed Food": [16, 17, 18, 19, 20, 21, 22, 23, 24],
    "Raw Material": [5, 13, 15, 25, 26, 28, 41, 44, 45, 46, 47, 68, 71, 72, 73, 74, 75, 76, 78, 79, 80, 81],
    "Simple Manufacture": [29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 42, 43, 48, 49, 56, 57, 58, 59, 60, 65, 66, 67, 69, 82, 83, 91, 92, 94, 95, 96],
    "Advance Manufacture": [70, 84, 85, 86, 87, 88, 89, 90, 93, 97],
    "Clothing Materials": [50, 51, 52, 53, 54, 55],
    "Finished Clothing": [61, 62, 63, 64],
    "Chemicals and Fertiliser": [28, 29, 31],
    "Advanced Chemicals and Medicine": [30],
    "Petroleum and Derivates": [27],
    "Unspecified": [99]
}

hs_category_map = {k: oldk for oldk, oldv in hs_category_dict.items() for k in oldv}
In [7]:
# Add 'Product' column
df['Product'] = df['ProductCode'].map(hs_code_map)

# Add 'Category' column
df['Category'] = df['ProductCode'].map(hs_category_map)

df
Out[7]:
Year TradeFlowName PartnerName PartnerISO3 Partner Region Partner Income Group ProductCode TradeValue Product Category
0 2017 Export Afghanistan AFG South Asia Low income 2 0.005094 Edible meat and offal Raw Food
1 2017 Export Afghanistan AFG South Asia Low income 8 0.223128 Fruits and nuts Raw Food
2 2017 Export Afghanistan AFG South Asia Low income 9 0.025942 Coffee, tea, spices Raw Food
3 2017 Export Afghanistan AFG South Asia Low income 12 0.272858 Oil seeds, fruits, grains Raw Food
4 2017 Export Afghanistan AFG South Asia Low income 15 0.007879 Fats, oils, waxes Raw Material
... ... ... ... ... ... ... ... ... ... ...
41354 2022 Import Zimbabwe ZWE Sub-Saharan Africa Lower middle income 84 0.013434 Machinery, appliances Advance Manufacture
41355 2022 Import Zimbabwe ZWE Sub-Saharan Africa Lower middle income 85 0.001487 Electrical machinery, equipment Advance Manufacture
41356 2022 Import Zimbabwe ZWE Sub-Saharan Africa Lower middle income 90 0.061217 Optical, medical instruments Advance Manufacture
41357 2022 Import Zimbabwe ZWE Sub-Saharan Africa Lower middle income 97 0.160931 Works of art, antiques Advance Manufacture
41358 2022 Import Zimbabwe ZWE Sub-Saharan Africa Lower middle income 99 0.093516 Unspecified Unspecified

123650 rows × 10 columns

2 - Preliminary Data Exploration¶

In [8]:
# Sum the values of all exports and all imports
df_ImpExpSum = df.groupby(['Year', 'TradeFlowName'])['TradeValue'].sum().reset_index()
# Multiply TradeValue by -1 for imports to make them negative
df_ImpExpSum['TradeValue'] = df_ImpExpSum.apply(lambda row: -row['TradeValue'] if row['TradeFlowName'] == 'Import' else row['TradeValue'], axis=1)

# Group by Year and calculate the sum of TradeValue, which will be net trade
df_trade_net = df_ImpExpSum.groupby('Year')['TradeValue'].sum().reset_index()
# Rename to TradeNet
df_trade_net.rename(columns={'TradeValue': 'TradeNet'}, inplace=True)

display(df_trade_net)
df_trade_net.set_index('Year').plot()
Year TradeNet
0 2017 38556.061103
1 2018 20099.533824
2 2019 36141.377233
3 2020 46044.512406
4 2021 20091.553148
5 2022 -67296.497619
Out[8]:
<AxesSubplot:xlabel='Year'>
In [9]:
# Create a copy of the original DataFrame
_df = df.copy()

# Multiply the import values by -1
_df.loc[_df['TradeFlowName'] == 'Import', 'TradeValue'] *= -1
# Filter data for the years 2020 and 2022
df_filtered = _df[_df['Year'].isin([2020, 2022])]

# Group by Category and Year to aggregate the trade values
df_grouped = df_filtered.groupby(['Category', 'Year'])['TradeValue'].sum().reset_index()

# Pivot the data to have years as columns and trade values as values
df_pivot = df_grouped.pivot(index='Category', columns='Year', values='TradeValue')

# Calculate the price change between 2020 and 2022
df_pivot['PriceChange'] = df_pivot[2022] - df_pivot[2020]
df_pivot['AbsPriceChange'] = abs(df_pivot[2022] - df_pivot[2020])

# Reset the index and set 'Category' as the new index
table = df_pivot.reset_index().set_index('Category')[['PriceChange','AbsPriceChange']]
table = table.sort_values('AbsPriceChange', ascending=False)['PriceChange'].astype(int).to_frame()
table = table.rename(columns={'PriceChange': 'Price Change Million USD'})


table
Out[9]:
Price Change Million USD
Category
Petroleum and Derivates -106141
Raw Material -14933
Advanced Chemicals and Medicine 12708
Chemicals and Fertiliser -7856
Raw Food -7808
Processed Food 4486
Finished Clothing 4177
Advance Manufacture 1863
Unspecified 1822
Clothing Materials -865
Simple Manufacture -794

Preliminary Data Exploration Insights¶

Our exploratory data analysis reveals that, in 2022 Italy transitioned from a net exporter to a net importer. This shift can largely be attributed to costs associated with petroleum and its derivatives. It is evident that recent changes in the global energy market have significantly influenced Italy's trade balance.

3 - Explainatory Data Analysis¶

Preparing the graphs¶

In [10]:
# Split the data at 2020
df_trade_up_to_2020 = df_trade_net[df_trade_net["Year"] <= 2020]
df_trade_from_2021 = df_trade_net[df_trade_net["Year"] >= 2020] 

# Create the traces for TradeNet
trace_up_to_2020 = go.Scatter(
    x = df_trade_up_to_2020["Year"],
    y = df_trade_up_to_2020["TradeNet"],
    mode = 'lines+markers',
    line = dict(color='grey')
)

trace_from_2021 = go.Scatter(
    x = df_trade_from_2021["Year"],
    y = df_trade_from_2021["TradeNet"],
    mode = 'lines+markers',
    line = dict(color='red')
)

# Create a point for the year 2020
trace_2020_point = go.Scatter(
    x = [2020],
    y = df_trade_net[df_trade_net["Year"] == 2020]["TradeNet"],
    mode = 'markers',
    marker = dict(color='red', size=8)
)

# Create a point for the year 2022
trace_2022_point = go.Scatter(
    x = [2022],
    y = df_trade_net[df_trade_net["Year"] == 2022]["TradeNet"],
    mode = 'markers',
    marker = dict(color='red', size=10)
)

data = [trace_up_to_2020, trace_from_2021, trace_2020_point, trace_2022_point]

layout = go.Layout(
    xaxis = dict(title=None, range=[2017, 2022.5]),
    title = 'Italy Net Trade from 2017 to 2022',
    yaxis = dict(title = 'Net Trade, Mil. USD', side = 'right'),
    showlegend=False, 
    
    # Set the background color to transparent
    paper_bgcolor='rgba(0,0,0,0)',
    plot_bgcolor='rgba(0,0,0,0)',
        
    shapes=[dict(type='line',
                 yref='y', y0=0, y1=0,
                 xref='paper', x0=0, x1=1,
                 line=dict(color='rgba(0,0,0,0.7)'),
                ),
    ],
    yaxis_gridcolor='rgba(128,128,128,0.15)',
    
    annotations=[dict(
        xref='paper',
        yref='paper',
        x=0.98,  # this puts it at the far left of the plot
        y=1.04,  # this puts it slightly above the plot
        text="<b>Italy's net trade peaked in 2020 and decline thereafter<br>transitioning to a net importer in 2022",
        showarrow=False,
        align='right',
    )],
)

fig1 = go.Figure(data=data, layout=layout)



# fig1.show() 
print()

In [11]:
# Filter data for the years 2020, 2021, and 2022
df_filtered = df[df['Year'].isin([2020, 2021, 2022])]

# For each year and category, calculate the net trade (exports - imports)
df_grouped = df_filtered.groupby(['Year', 'Category', 'TradeFlowName'])['TradeValue'].sum().reset_index()

# Pivot the data to have 'TradeFlowName' as columns
df_pivot = df_grouped.pivot_table(index=['Year', 'Category'], columns='TradeFlowName', values='TradeValue', fill_value=0).reset_index()

# Calculate the net trade
df_pivot['NetTrade'] = df_pivot['Export'] - df_pivot['Import']

# Order categories by 2022 net trade values
categories_2022 = df_pivot[df_pivot['Year'] == 2022].sort_values('NetTrade')['Category'].unique()

# Reorder the rows based on categories_2022
df_pivot['Category'] = pd.Categorical(df_pivot['Category'], categories=categories_2022, ordered=True)
df_pivot = df_pivot.sort_values('Category')

net_trade_2021 = df_pivot[df_pivot['Year'] == 2021]['NetTrade']
colors_2021 = ['rgb(135,130,135)' if x > -40000 else 'lightcoral' for x in net_trade_2021]


# Create traces
trace_2020 = go.Bar(
    x=df_pivot[df_pivot['Year'] == 2020]['Category'],
    y=df_pivot[df_pivot['Year'] == 2020]['NetTrade'],
    name='2020',
    marker_color='rgb(155,155,155)'
)

trace_2021 = go.Bar(
    x=df_pivot[df_pivot['Year'] == 2021]['Category'],
    y=df_pivot[df_pivot['Year'] == 2021]['NetTrade'],
    name='2021',
    marker_color=colors_2021,
)
    
net_trade_2022 = df_pivot[df_pivot['Year'] == 2022]['NetTrade']
colors_2022 = ['rgb(115,115,115)' if x > -100000 else 'red' for x in net_trade_2022]

trace_2022 = go.Bar(
    x=df_pivot[df_pivot['Year'] == 2022]['Category'],
    y=net_trade_2022,
    name='2022',
    marker_color=colors_2022,
)

layout = go.Layout(
    title='Net Trade by product Category from 2020 to 2022',
    
    # Adjust the position of the xaxis title
    xaxis=dict(title='Category', title_standoff=25),
    
    # Place the yaxis title on the right
    yaxis=dict(title='Net Trade, Mil. USD', side='right'),
    
    barmode='group',  # This makes bars grouped
    
    # Move legend to the right-hand side
    legend=dict(orientation='v'),
    
    annotations=[
        dict(
            x=-0.42,
            y=-32000,
            xref='x',
            yref='y',
            text='2020',
            showarrow=False,
        ),
        dict(
            x=-0.15,
            y=-64000,
            xref='x',
            yref='y',
            text='2021',
            showarrow=False,
        ),
        dict(
            x=0.15,
            y=-139000,
            xref='x',
            yref='y',
            text='2022',
            showarrow=False,
        ),
        dict(
        xref='paper',
        yref='paper',
        x=0.0,  # this puts it at the far left of the plot
        y=1.05,  # this puts it slightly above the plot
        text="<b>The main cause was the recent spike in petrol and derivates prices",
        showarrow=False,
        align='left',
    )],
    
    # Set the background color to transparent
    paper_bgcolor='rgba(0,0,0,0)',
    plot_bgcolor='rgba(0,0,0,0)',
        
    shapes=[dict(type='line',
                 yref='y', y0=0, y1=0,
                 xref='paper', x0=0, x1=1,
                 line=dict(color='rgba(0,0,0,1)'),
                ),
    ],
    yaxis_gridcolor='rgba(128,128,128,0.15)',    
)

# Add traces to the figure
fig2 = go.Figure(data=[trace_2020, trace_2021, trace_2022], layout=layout)


# Update the layout to show the legend and hide the legend items for the bar traces
fig2.update_layout(layout)
fig2.update_traces(showlegend=False, selector=dict(type='bar'))

# fig2.show()
print()

In [12]:
# Filter out rows where Category is 'Petroleum and Derivates'
df_no_petroleum = df[df['Category'] != 'Petroleum and Derivates']
df_only_petroleum = df[df['Category'] == 'Petroleum and Derivates']


# Calculate total net trade for each year
df_total_trade = df.groupby(['Year', 'TradeFlowName'])['TradeValue'].sum().unstack().reset_index()
df_total_trade['NetTrade'] = df_total_trade['Export'] - df_total_trade['Import']

# Calculate net trade for each year excluding 'Petroleum and Derivates'
df_trade_no_petroleum = df_no_petroleum.groupby(['Year', 'TradeFlowName'])['TradeValue'].sum().unstack().reset_index()
df_trade_no_petroleum['NetTrade'] = df_trade_no_petroleum['Export'] - df_trade_no_petroleum['Import']

# Calculate net trade for each year only for 'Petroleum and Derivates'
df_trade_only_petroleum = df_only_petroleum.groupby(['Year', 'TradeFlowName'])['TradeValue'].sum().unstack().reset_index()
df_trade_only_petroleum['NetTrade'] = df_trade_only_petroleum['Export'] - df_trade_only_petroleum['Import']


# Create traces
trace_total = go.Scatter(
    x=df_total_trade['Year'],
    y=df_total_trade['NetTrade'],
    mode='lines+markers+text',
    textposition='top right',
    text=['Total Net Trade'],
    line=dict(color='rgba(111,111,111,0.7)'),
)

trace_no_petroleum = go.Scatter(
    x=df_trade_no_petroleum['Year'],
    y=df_trade_no_petroleum['NetTrade'],
    mode='lines+markers+text',
    textposition='top right',
    text=['Net Trade excl Petroleum and Derivates'],
    line=dict(color='blue'),
)

trace_only_petroleum = go.Scatter(
    x=df_trade_only_petroleum['Year'],
    y=df_trade_only_petroleum['NetTrade'],
    mode='lines+markers+text',
    textposition='top right',
    text=['Petroleum and Derivatives'],
    line=dict(color='rgba(255,0,0,0.4)',),
)


# Create layout
layout = go.Layout(
    title='Net Trade of Italy from 2017 to 2022',
    xaxis=dict(title=None),
    yaxis=dict(title='Net Trade, Mil. USD', side = 'right'),
    showlegend=False,
    
    # Set the background color to transparent
    paper_bgcolor='rgba(0,0,0,0)',
    plot_bgcolor='rgba(0,0,0,0)',
        
    shapes=[dict(type='line',
                 yref='y', y0=0, y1=0,
                 xref='paper', x0=0, x1=1,
                 line=dict(color='rgba(128,128,128,0.5)'),
                ),
    ],
    yaxis_gridcolor='rgba(128,128,128,0.15)',
        
    annotations=[dict(
        xref='paper',
        yref='paper',
        x=0.01,  # this puts it at the far left of the plot
        y=1.08,  # this puts it slightly above the plot
        text="<b>If we exclude petroleum and derivates, net trade remain stable",
        showarrow=False,
        align='right',
    )],
)

# Add traces to the figure and plot
fig3 = go.Figure(data=[trace_total, trace_no_petroleum, trace_only_petroleum], layout=layout)

# fig3.show()
print()

In [13]:
# Filter data for the years 2020, 2021, and 2022, the product being "Petroleum and Derivates", and TradeFlowName being "Import"
df_petroleum = df[(df['Year'].isin([2020, 2021, 2022])) & (df['Category'] == 'Petroleum and Derivates') & (df['TradeFlowName'] == 'Import')]

# Group by Year and PartnerName to get the total TradeValue
df_grouped_petroleum = df_petroleum.groupby(['Year', 'PartnerName'])['TradeValue'].sum().reset_index()

# Get the top 10 importers for the year 2022
top10_2022 = df_grouped_petroleum[df_grouped_petroleum['Year'] == 2022].nlargest(10, 'TradeValue')['PartnerName'].unique()

# Filter original grouped dataset to only include these top 10 countries from 2022
df_top10 = df_grouped_petroleum[df_grouped_petroleum['PartnerName'].isin(top10_2022)].copy()

# Get the order of countries based on their 2022 TradeValue
order_2022 = df_top10[df_top10['Year'] == 2022].sort_values('TradeValue', ascending=False)['PartnerName']

# Create a categorical type with the order for the years 2020, 2021, and 2022
df_top10.loc[:, 'PartnerName'] = pd.Categorical(df_top10['PartnerName'], categories=order_2022, ordered=True)
df_top10 = df_top10.sort_values(['PartnerName', 'Year'])

# Create traces
trace_2020 = go.Bar(
    x=df_top10[df_top10['Year'] == 2020]['PartnerName'],
    y=df_top10[df_top10['Year'] == 2020]['TradeValue'],
    marker_color='grey',
    name='2020',
)

trace_2021 = go.Bar(
    x=df_top10[df_top10['Year'] == 2021]['PartnerName'],
    y=df_top10[df_top10['Year'] == 2021]['TradeValue'],
    marker_color='lightcoral',
    name='2021',
)

trace_2022 = go.Bar(
    x=df_top10[df_top10['Year'] == 2022]['PartnerName'],
    y=df_top10[df_top10['Year'] == 2022]['TradeValue'],
    marker_color='red',
    name='2022',
)

# Create layout
layout = go.Layout(
    title='Top 10 Importers of Petroleum and Derivates Growth from 2020 to 2022',
    xaxis=dict(title=None),
    yaxis=dict(title='Trade Value, Mil. USD'),
    barmode='group',  # This makes bars grouped
    
        
    # Set the background color to transparent
    paper_bgcolor='rgba(0,0,0,0)',
    plot_bgcolor='rgba(0,0,0,0)',
        
    shapes=[dict(type='line',
                 yref='y', y0=0, y1=0,
                 xref='paper', x0=0, x1=1,
                 line=dict(color='rgba(128,128,128,0.5)'),
                ),
    ],
    yaxis_gridcolor='rgba(128,128,128,0.15)',
    
        
    annotations=[
        dict(
            xref='paper',
            yref='paper',
            x=0.98,  # this puts it at the far left of the plot
            y=0.97,  # this puts it slightly above the plot
            text="""<b>Imports of petrol and derivates have increase drastically from all markets""",
            showarrow=False,
            align='right',)],
)

# Add traces to the figure and plot
fig4 = go.Figure(data=[trace_2020, trace_2021, trace_2022], layout=layout)


text = """<b>- The Russian Federation remains the major importer <br>
        - Imports from North Africa, Middle East, and Central Asia has increased drastically <br>
        - LNG (Liquefied Natural Gas) imports from the USA have increased <br>
        - Imports European countries have become more substantial in the mix"""

fig4_text = go.Figure()

fig4_text.add_annotation(
    x=0,   # Adjusted x coordinate for right offset
    y=0.45,   # Adjusted y coordinate for bottom offset
    xref="paper",
    yref="paper",
    text=text,
    showarrow=False,
    font=dict(size=14, color="black"),
    align="left",
    ax=0,
    ay=0,
)

fig4_text.update_layout(
    xaxis=dict(visible=False),
    yaxis=dict(visible=False),
    height=100,  # Adjusted height and width
    paper_bgcolor='white',
    plot_bgcolor='white',
    margin=dict(t=50, b=50, l=50, r=50),  # Adjusted margins
)

# fig4.show()
# fig4_text.show()
print()

In [14]:
# Filter the data for Category "Petroleum and Derivatives" and TradeFlowName "Import"
df_petroleum_imports = df[(df['Category'] == 'Petroleum and Derivates') & (df['TradeFlowName'] == 'Import')]

# Compute imports for 2020
df_2020 = df_petroleum_imports[df_petroleum_imports['Year'] == 2020]
imports_2020 = df_2020.groupby('PartnerISO3')['TradeValue'].sum()
df_imports_2020 = pd.DataFrame({'iso_alpha': imports_2020.index, 'imports': imports_2020.values})

# Create a choropleth for 2020 imports
fig5 = px.choropleth(df_imports_2020, locations='iso_alpha', color='imports',
                     color_continuous_scale='Reds',
                     range_color=(0, 28000),
                     hover_name='iso_alpha',
                     labels={'imports': 'Imports'},
                     title='Imports of Petroleum and Derivatives by Country for 2020')

# Compute imports for 2022
df_2022 = df_petroleum_imports[df_petroleum_imports['Year'] == 2022]
imports_2022 = df_2022.groupby('PartnerISO3')['TradeValue'].sum()
df_imports_2022 = pd.DataFrame({'iso_alpha': imports_2022.index, 'imports': imports_2022.values})

# Create a choropleth for 2022 imports
fig6 = px.choropleth(df_imports_2022, locations='iso_alpha', color='imports',
                     color_continuous_scale='Reds',
                     range_color=(0, 28000),
                     hover_name='iso_alpha',
                     labels={'imports': 'Imports'},
                     title='Imports of Petroleum and Derivatives by Country for 2022')

# fig5.show()
# fig6.show()

print()

In [15]:
# Exclude 'Petroleum and Derivates' from the data
df_excluding_petroleum = df[df['Category'] != 'Petroleum and Derivates']

# Filter data for 2020 and 2022
df_2020_excl = df_excluding_petroleum[df_excluding_petroleum['Year'] == 2020]
df_2022_excl = df_excluding_petroleum[df_excluding_petroleum['Year'] == 2022]

# Compute net trade for 2020 excluding petroleum
exports_2020_excl = df_2020_excl[df_2020_excl['TradeFlowName'] == 'Export'].groupby('PartnerISO3')['TradeValue'].sum()
imports_2020_excl = df_2020_excl[df_2020_excl['TradeFlowName'] == 'Import'].groupby('PartnerISO3')['TradeValue'].sum()
net_trade_2020_excl = exports_2020_excl - imports_2020_excl
df_net_trade_2020_excl = pd.DataFrame({'iso_alpha': net_trade_2020_excl.index, 'net_trade': net_trade_2020_excl.values})

# Compute net trade for 2022 excluding petroleum
exports_2022_excl = df_2022_excl[df_2022_excl['TradeFlowName'] == 'Export'].groupby('PartnerISO3')['TradeValue'].sum()
imports_2022_excl = df_2022_excl[df_2022_excl['TradeFlowName'] == 'Import'].groupby('PartnerISO3')['TradeValue'].sum()
net_trade_2022_excl = exports_2022_excl - imports_2022_excl
df_net_trade_2022_excl = pd.DataFrame({'iso_alpha': net_trade_2022_excl.index, 'net_trade': net_trade_2022_excl.values})

# Create choropleths for 2020 and 2022 excluding petroleum
fig7 = px.choropleth(df_net_trade_2020_excl, locations='iso_alpha', color='net_trade',
                          color_continuous_scale='RdYlBu',
                          range_color=(-60000, 60000),
                          hover_name='iso_alpha',
                          labels={'net_trade':'Net Trade'},
                          title='Net Trade by Country for 2020 (Excluding Petroleum and Derivates)')

fig8 = px.choropleth(df_net_trade_2022_excl, locations='iso_alpha', color='net_trade',
                          color_continuous_scale='RdYlBu',
                          range_color=(-60000, 60000),
                          hover_name='iso_alpha',
                          labels={'net_trade':'Net Trade'},
                          title='Net Trade by Country for 2022 (Excluding Petroleum and Derivates)')




text = """<b>USA and China holds as the major Export and Import Markets. <br>
            Both Countries have grown in value from 2020."""
subtitle_fig2 = go.Figure()

fig8_text = go.Figure()

fig8_text.add_annotation(
    x=0,   # Adjusted x coordinate for right offset
    y=0.45,   # Adjusted y coordinate for bottom offset
    xref="paper",
    yref="paper",
    text=text,
    showarrow=False,
    font=dict(size=14, color="black"),
    align="left",
    ax=0,
    ay=0,
)

fig8_text.update_layout(
    xaxis=dict(visible=False),
    yaxis=dict(visible=False),
    height=100,  # Adjusted height and width
    paper_bgcolor='white',
    plot_bgcolor='white',
    margin=dict(t=5, b=50, l=50, r=50),  # Adjusted margins
)


# fig7.show()
# fig8.show()
# fig8_text.show()
print()

In [16]:
# Filter the data for exports to USA in 2022
df_usa_2022 = df[(df['Year'] == 2022) & 
                  (df['TradeFlowName'] == 'Export') & 
                  (df['PartnerISO3'] == 'USA')]

# Sum the TradeValue for each product category
df_usa_2022_grouped = df_usa_2022.groupby(['Category', 'Product'])['TradeValue'].sum().reset_index()

# Calculate the TradeValue_sum
df_usa_2022_grouped['TradeValue_sum'] = df_usa_2022_grouped.groupby('Category')['TradeValue'].transform('sum')

# Create the Treemap with color based on TradeValue_sum
fig9 = px.treemap(df_usa_2022_grouped, 
                 path=['Category', 'Product'], 
                 values='TradeValue',
                 color='TradeValue',
                 color_continuous_scale='bluyl',
                 color_continuous_midpoint=8000
                )

fig9.update_layout(
    title="USA: Major Export Market, 2022 Treemap",
    width=1000, 
    height=900)

# fig9.show()
print()

In [17]:
# Filter the data for imports from China in 2022
df_china_2022 = df[(df['Year'] == 2022) & 
                   (df['TradeFlowName'] == 'Import') & 
                   (df['PartnerISO3'] == 'CHN')]

# Sum the TradeValue for each product category
df_china_2022_grouped = df_china_2022.groupby(['Category', 'Product'])['TradeValue'].sum().reset_index()

# Calculate the TradeValue_sum
df_china_2022_grouped['TradeValue_sum'] = df_china_2022_grouped.groupby('Category')['TradeValue'].transform('sum')

# Create the Treemap with color based on TradeValue_sum
fig10 = px.treemap(df_china_2022_grouped, 
                 path=['Category', 'Product'], 
                 values='TradeValue',
                 color='TradeValue',
                 color_continuous_scale='reds',
                 color_continuous_midpoint=8000
                )

fig10.update_layout(
    title="China: Major Import Market, 2022 Treemap",
    width=1000, 
    height=900)

# fig10.show()
print()

In [18]:
# Title figure
title_text = 'Italy Trade Analysis'
title_fig = go.Figure()

title_fig.add_annotation(
    x=0.5,
    y=0.5,
    xref='paper',
    yref='paper',
    text=title_text,
    showarrow=False,
    font=dict(size=24, color='black'),
)

title_fig.update_layout(
    title_text='',
    xaxis=dict(visible=False),
    yaxis=dict(visible=False),
    height=100,
    margin=dict(t=10, b=10, l=10, r=10),
)

# Subtitle figure 1
subtitle_text1 = 'Petrol and Derivates Breakdown'
subtitle_fig1 = go.Figure()

subtitle_fig1.add_annotation(
    x=0.5,
    y=0.5,
    xref='paper',
    yref='paper',
    text=subtitle_text1,
    showarrow=False,
    font=dict(size=18, color='black'),
)

subtitle_fig1.update_layout(
    title_text='',
    xaxis=dict(visible=False),
    yaxis=dict(visible=False),
    height=60,
    margin=dict(t=10, b=10, l=10, r=10),
)

# Subtitle figure 2
subtitle_text2 = 'Major Import and Export Markets, Excluding Petrol and Derivates.'

subtitle_fig2.add_annotation(
    x=0.5,
    y=0.5,
    xref='paper',
    yref='paper',
    text=subtitle_text2,
    showarrow=False,
    font=dict(size=18, color='black'),
)

subtitle_fig2.update_layout(
    title_text='',
    xaxis=dict(visible=False),
    yaxis=dict(visible=False),
    height=60,
    margin=dict(t=10, b=10, l=10, r=10),
)


print()

Show the full Visualization¶

In [19]:
title_fig.show()
fig1.show()
subtitle_fig1.show()
fig2.show()
fig3.show()
fig4.show()
fig5.show()
fig6.show()
fig4_text.show()
subtitle_fig2.show()
fig7.show()
fig8.show()
fig8_text.show()
fig9.show()
fig10.show()